Referencing Row Number in R
F

4

28

How do I reference the row number of an observation? For example, if you have a data.frame called "data" and want to create a variable data$rownumber equal to each observation's row number, how would you do it without using a loop?

Fibre answered 18/7, 2013 at 19:53 Comment(0)
B
32

These are present by default as rownames when you create a data.frame.

R> df = data.frame('a' = rnorm(10), 'b' = runif(10), 'c' = letters[1:10])
R> df
            a          b c
1   0.3336944 0.39746731 a
2  -0.2334404 0.12242856 b
3   1.4886706 0.07984085 c
4  -1.4853724 0.83163342 d
5   0.7291344 0.10981827 e
6   0.1786753 0.47401690 f
7  -0.9173701 0.73992239 g
8   0.7805941 0.91925413 h
9   0.2469860 0.87979229 i
10  1.2810961 0.53289335 j

and you can access them via the rownames command.

R> rownames(df)
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"

If you need them as numbers, coerce them with as.numeric, as in as.numeric(rownames(df)).

You don't need to add them: if you know what you are looking for (say, the item where df$c == 'i'), you can use the which command:

R> which(df$c =='i')
[1] 9

or if you don't know the column

R> which(df == 'i', arr.ind=TRUE)
     row col
[1,]   9   3

You can then access the element using df[9, 'c'] or df$c[9].

If you do want to add them, you could use df$rownumber <- as.numeric(rownames(df)), though this is less robust than df$rownumber <- 1:nrow(df): if you have assigned to rownames, they will no longer be the default index numbers. The which command continues to return index numbers even if you do assign to rownames.
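A minimal sketch of how the two approaches can diverge, using a small made-up data frame (df2 and its columns are illustrative, not from the question). After subsetting, the row names keep their old labels, while seq_len(nrow(...)) always counts current positions. seq_len is used here instead of 1:nrow because it also behaves sensibly for a zero-row data frame.

```r
# Hypothetical example data; names and values are illustrative only.
df2 <- data.frame(x = c(10, 20, 30, 40, 50), y = letters[1:5])

# Subsetting keeps the original row names ("2", "4", "5").
sub <- df2[df2$x %in% c(20, 40, 50), ]

# Row names reflect the labels inherited from df2 ...
from_names <- as.numeric(rownames(sub))

# ... while seq_len() gives the current positions 1..nrow(sub).
from_position <- seq_len(nrow(sub))

# Custom row names cannot be coerced to numbers at all (every value becomes NA).
named <- df2
rownames(named) <- paste0("id_", 1:5)
coerced <- suppressWarnings(as.numeric(rownames(named)))
```

Here from_names and from_position differ, which is the situation the answer warns about.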

Bill answered 18/7, 2013 at 20:17 Comment(0)
D
13

Simply:

data$rownumber = 1:nrow(data)
Danged answered 18/7, 2013 at 20:4 Comment(3)
I can't think of a time when this would be useful though. Especially given the function which. - Datha
It's useful if you need a sorting index. - Anathematize
I have a dataframe with two positional variables (say "Plot" and "Fruit_number") but at each position I have seven measurements. I want one of them, but I do not have a unique identifier. I can use filter and the mod function on row numbers to select a value from each fruit within a plot: dplyr::filter(row_number() %% 4 == 1) - Stambul
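The modulo idea in the last comment can be sketched in base R as follows. This assumes seven consecutive rows per fruit, matching the seven measurements the comment describes (the comment's own snippet uses 4), and the data frame and its values are made up for illustration. With dplyr, the same condition would go inside filter().

```r
# Hypothetical layout: 2 plots x 2 fruits, 7 measurements each, stored in order.
measurements <- data.frame(
  Plot = rep(c("A", "B"), each = 14),
  Fruit_number = rep(rep(1:2, each = 7), times = 2),
  value = seq_len(28)
)

# Position of each row within the data frame.
row_id <- seq_len(nrow(measurements))

# Keep the first row of every block of seven.
first_of_each <- measurements[row_id %% 7 == 1, ]
```

This only works if the rows are already ordered so that each fruit's measurements are contiguous.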
S
6

Perhaps with dataframes, one of the easiest and most practical solutions is:

data = dplyr::mutate(data, rownum = dplyr::row_number())
Supererogation answered 17/4, 2020 at 19:47 Comment(0)
I
3

This is probably the simplest way:

data$rownumber = 1:dim(data)[1]

It's probably worth noting that if you want to select a row by its row index, you can do this with simple bracket notation:

data[3,]

vs.

data[data$rownumber==3,]

So I'm not really sure what this new column accomplishes.

Inverson answered 18/7, 2013 at 19:58 Comment(2)
You can use nrow(data) instead of dim(data)[1]. - Eclat
rownames are characters, not numeric. That might lead to confusion. - Anathematize
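One case where a stored row number does something that data[3,] cannot, sketched here with made-up data (d and score are illustrative names): once the rows are reordered, positional indexing follows the new order, whereas the stored column still identifies the original row and can be used to restore the original order.

```r
# Hypothetical example data.
d <- data.frame(score = c(5, 1, 4, 2, 3))
d$rownumber <- seq_len(nrow(d))

# Reorder by score; positions change, the stored row numbers do not.
sorted <- d[order(d$score), ]

# Third row after sorting versus the row that was originally third.
third_by_position <- sorted[3, "score"]
third_by_original <- sorted$score[sorted$rownumber == 3]

# The stored column also restores the original order.
restored <- sorted[order(sorted$rownumber), ]
```

This is the "sorting index" use mentioned in the comments on the earlier answer.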
